fix(codemod): batch real-world v1→v2 migration bugs#2376
fix(codemod): batch real-world v1→v2 migration bugs#2376felixweinberger wants to merge 1 commit into
Conversation
🦋 Changeset detectedLatest commit: 83f3b09 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/core
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
6175b71 to
83f3b09
Compare
| for (const [filePath, used] of perFileUsed) { | ||
| if (used.size === 0) continue; | ||
| const owner = ownerOf(filePath); | ||
| if (!owner) continue; | ||
| let bucket = result.get(owner); | ||
| if (!bucket) result.set(owner, (bucket = new Set())); | ||
| for (const p of used) bucket.add(p); | ||
| } | ||
|
|
||
| // Manifests that own no changed files (or whose files only used packages later pruned by the | ||
| // surviving-specifier check) still get an entry so the v1 SDK is removed and zod is checked. | ||
| for (const p of packageJsonPaths) { | ||
| if (!result.has(p)) result.set(p, new Set()); | ||
| } | ||
| if (fallback && !result.has(fallback)) result.set(fallback, new Set()); |
There was a problem hiding this comment.
🔴 In a workspace where @modelcontextprotocol/sdk is declared only in the root package.json (hoisted) while member packages have their own manifests without it, attributeUsedPackages() assigns each member file's v2 packages to the member manifest, which applyPackageJsonUpdate() then skips (no v1 SDK dep) — so the additions are silently dropped, while the root manifest only gets an empty set and loses the v1 SDK without gaining any v2 package. The pre-PR updatePackageJson() handled this layout correctly; un-appliable attributions should bubble up to the nearest ancestor manifest that actually declares the v1 SDK dep (or the fallback).
Extended reasoning...
What the bug is. attributeUsedPackages() (packages/codemod/src/utils/packageJsonUpdater.ts:29-67) attributes each file's surviving v2 packages to the nearest enclosing package.json discovered by the walk. applyPackageJsonUpdate() then returns undefined for any manifest that does not carry the v1 @modelcontextprotocol/sdk dependency (if (!inDeps && !inDevDeps) return undefined). When those two facts combine — a member manifest owns the changed files but the v1 dep is hoisted to the workspace root — the member's attributed v2 packages are silently discarded, and the root manifest only ever receives an empty Set from the "manifests that own no changed files still get an entry" loop.\n\nCode path / step-by-step proof. Workspace layout:\n\n\npackage.json # { dependencies: { "@modelcontextprotocol/sdk": "^1.0.0" }, workspaces: ["packages/*"] }\npackages/svc/package.json # { name: "svc" } ← no SDK dep (hoisted to root)\npackages/svc/index.ts # import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js'\n\n\n1. collectSourceFiles() discovers both manifests, so ownerDirs contains packages/svc and the root.\n2. packages/svc/index.ts is rewritten to import @modelcontextprotocol/server; perFileUsedPackages records { '@modelcontextprotocol/server' } for that file.\n3. ownerOf(packages/svc/index.ts) returns packages/svc/package.json (deepest enclosing manifest), so @modelcontextprotocol/server is bucketed under the member manifest.\n4. applyPackageJsonUpdate(packages/svc/package.json, {'@modelcontextprotocol/server'}) reads the member manifest, finds no @modelcontextprotocol/sdk in deps/devDeps, and returns undefined — the addition is dropped, nothing is written.\n5. The root manifest gets only an empty set via the leftover loop, so applyPackageJsonUpdate(root, ∅) deletes @modelcontextprotocol/sdk from the root and adds nothing (packagesToAdd is empty).\n\nNet result: the source files now import @modelcontextprotocol/server / /client / /node, but no manifest anywhere declares them and the v1 SDK has been removed — the migrated workspace cannot install the SDK at all. One verifier reproduced this empirically: final root dependencies was {}.\n\nWhy this is a regression. Before this PR, run() collected all used v2 packages globally and called updatePackageJson(targetDir, allUsedPackages), which resolved findPackageJson(targetDir) — the root manifest carrying the v1 dep — and added every used v2 package there. The hoisted-root layout (declaring shared deps once at the workspace root, common with npm/yarn workspaces) worked correctly pre-PR and breaks post-PR. The new #150 test in realWorldGaps.test.ts only covers the layout where every member declares its own SDK dep, so this case is untested. The same dropping also happens for nested "marker" manifests (e.g. a subdirectory package.json containing only {"type": "commonjs"}) that own files but carry no SDK dep.\n\nImpact. Silent dependency loss: the run reports success, the per-member output shows nothing, and the project only fails at the next npm install / build with unresolvable @modelcontextprotocol/* imports — and since the v1 SDK was also removed, there is no working fallback.\n\nHow to fix. When applyPackageJsonUpdate() would skip a manifest because it lacks the v1 dep, bubble that bucket's packages up to the nearest ancestor manifest (among packageJsonPaths / the fallback) that actually declares @modelcontextprotocol/sdk, and apply them there. Alternatively, have ownerOf() (or a pre-pass) check the candidate manifest for the v1 dep and keep walking up when it is absent. Either way, add a hoisted-dep workspace fixture to realWorldGaps.test.ts asserting the root manifest gains the used v2 packages.
| */ | ||
| function restoreFileHeader(sourceFile: import('ts-morph').SourceFile, originalHeader: string): void { | ||
| if (originalHeader.trim() === '') return; | ||
| const newText = sourceFile.getFullText(); | ||
| if (newText.startsWith(originalHeader)) return; | ||
|
|
||
| let body = newText; | ||
| const displacedAt = body.indexOf(originalHeader); | ||
| if (displacedAt !== -1) { | ||
| body = body.slice(0, displacedAt) + body.slice(displacedAt + originalHeader.length); | ||
| } | ||
| body = body.replace(/^[\r\n]+/, ''); | ||
| sourceFile.replaceWithText(originalHeader + body); | ||
| } |
There was a problem hiding this comment.
🔴 restoreFileHeader() calls sourceFile.replaceWithText() before the runner resolves diagnostic lines via d.resolveCurrentLine(), which forgets every node in the file — the resolvers throw and silently keep snapshot lines computed on the header-less text, so insertDiagnosticComments() places @mcp-codemod-error markers off by the restored header's height (e.g. inside the restored JSDoc block, which can also corrupt the comment syntax). Fix by adjusting diagnostic lines for the prepended header height, running the restore after line resolution, or using offset-preserving insertion instead of full-text replacement.
Extended reasoning...
The bug. restoreFileHeader() (runner.ts:90-103) repairs a dropped/displaced file header by calling sourceFile.replaceWithText(originalHeader + body). In ts-morph, replaceWithText on the SourceFile forgets every existing node in the file. The very next thing the runner does (the loop right after if (fileChanges > 0) restoreFileHeader(...)) is resolve diagnostic line numbers via d.resolveCurrentLine() — and those resolvers (created by actionRequired() in utils/diagnostics.ts) close over node references captured during the transforms. After replaceWithText, every such node throws InvalidOperationError, the catch silently keeps the snapshot d.line, and the resolver is deleted.
Why the snapshot line is wrong. The snapshot line = node.getStartLineNumber() was taken while the header was missing (the typical sequence: the JSDoc header is dropped early by importPaths/symbolRenames/removedApis, while the insertComment diagnostics — the new ServerContext stale-property markers, .code-on-SdkHttpError markers, destructuring warnings — come from later transforms). restoreFileHeader then re-prepends the multi-line header, shifting every statement down by the header's height. The kept snapshot line is now off by exactly that many lines.
Concrete walkthrough. Take the #164 fixture plus one helper that triggers the new contextTypes marker:
/** // 4-line JSDoc header
* Twenty lines of design documentation...
* that must not be silently deleted.
*/
import { RequestHandlerExtra } from '@modelcontextprotocol/sdk/shared/protocol.js';
export function helper(extra: RequestHandlerExtra<...>) { return extra.sendRequest(...); }- importPaths/symbolRenames remove the rewritten import — the JSDoc header (its leading trivia) goes with it. The file now starts at the helper.
- contextTypes flags
extra.sendRequestwith anactionRequiredinsertComment diagnostic; its snapshot line is 3 (header-less coordinates). restoreFileHeaderre-prepends the 4-line header viareplaceWithText— the parameter node now sits on line 8, but the captured node is forgotten.- The resolver throws; the runner keeps line 3.
insertDiagnosticComments()inserts/* @mcp-codemod-error ... */at line 3 — inside the restored JSDoc block. Verifiers reproduced this end-to-end: the inserted comment's*/terminates the surrounding/** ... */early, leaving the remaining doc lines as stray tokens, i.e. syntactically broken output. TheisInsideLiteraland already-marked dedup checks also evaluate the wrong line, andformatDiagnosticreports the wrongfile:line.
Why nothing catches it. The trigger is precisely the combination the restore feature exists for — header dropped by a later transform plus an insertComment diagnostic in the same file — and the new tests exercise these separately but never together (the #164 fixtures contain no actionRequired diagnostics), so 407/407 passing doesn't cover it. All four verifiers confirmed; two reproduced it empirically against ts-morph 28.
How to fix. Any of: (a) move the restoreFileHeader call to after the resolveCurrentLine loop and add the prepended header's line count to each diagnostic's resolved line; (b) keep the order but compute the height delta (header line count when the header was dropped) and adjust d.line for diagnostics whose resolver failed; or (c) restore the header with an offset-preserving insertText(0, originalHeader) (plus removal of any displaced copy via targeted edits) instead of full-text replaceWithText, so existing nodes stay alive and the resolvers return correct lines.
| const ownerDirs = packageJsonPaths.map(p => path.dirname(p)).toSorted((a, b) => b.length - a.length); | ||
| const result = new Map<string, Set<string>>(); | ||
|
|
||
| // A run pointed at a sub-directory may not contain its own package.json — fall back to walk-up. | ||
| const fallback = findPackageJson(targetDir); | ||
|
|
||
| const ownerOf = (filePath: string): string | undefined => { | ||
| const dir = path.dirname(filePath); | ||
| for (const ownerDir of ownerDirs) { | ||
| if (dir === ownerDir || dir.startsWith(ownerDir + path.sep)) return path.join(ownerDir, 'package.json'); | ||
| } | ||
| return fallback; | ||
| }; |
There was a problem hiding this comment.
🔴 On Windows, attributeUsedPackages() never matches a source file to its owning workspace manifest: perFileUsed keys come from ts-morph's getFilePath() (always forward slashes), while ownerDirs come from path.join()-built packageJsonPaths (backslashes), so dir === ownerDir / dir.startsWith(ownerDir + path.sep) can never be true. Every file falls back to findPackageJson(targetDir), so workspace member package.json files get the v1 SDK removed but no v2 packages added, and the v2 packages are silently dropped if the fallback manifest has no v1 SDK dep — normalize separators on both sides before comparing (as fileWalk.ts already does for ignore matching).
Extended reasoning...
What the bug is. attributeUsedPackages() in packages/codemod/src/utils/packageJsonUpdater.ts decides which package.json owns each changed source file by comparing path.dirname(filePath) (where filePath is a key of perFileUsed) against ownerDirs derived from packageJsonPaths. These two sides come from different path conventions. The perFileUsed keys are set in runner.ts via perFileUsedPackages.set(sourceFile.getFilePath(), surviving), and ts-morph's getFilePath() returns a StandardizedFilePath — always forward-slash separated, even on Windows (e.g. C:/repo/packages/svc/index.ts). The packageJsonPaths, however, are produced by collectSourceFiles() in fileWalk.ts using native path.resolve/path.join, which yield backslash paths on Windows (C:\\repo\\packages\\svc\\package.json), and ownerDirs = path.dirname(p) keeps those backslashes.\n\nThe code path. In ownerOf(), dir = path.dirname(filePath) — path.win32.dirname accepts forward slashes but does not convert them, so dir stays forward-slashed (C:/repo/packages/svc). The comparison dir === ownerDir || dir.startsWith(ownerDir + path.sep) (with path.sep === '\\' on Windows) compares a forward-slash string against a backslash string and can therefore never match. Every file silently takes the fallback = findPackageJson(targetDir) branch instead of being attributed to its enclosing workspace member.\n\nWhy nothing else prevents it. Nothing normalizes the separators before the comparison. The PR is clearly Windows-aware elsewhere — fileWalk.ts does path.relative(root, full).replaceAll('\\\\', '/') so --ignore globs match on Windows, and the pre-existing runner code relied on getFilePath()'s forward slashes (fp.includes('/node_modules/')) — so this is an oversight in the new attribution code, not a deliberate POSIX-only design. The new realWorldGaps.test.ts monorepo fixture only runs on POSIX temp dirs in CI, so the mismatch is invisible to the test suite.\n\nImpact. This silently breaks the PR's headline monorepo feature (#150) on Windows. Concrete walk-through for the exact fixture in the new test, run on Windows:\n1. collectSourceFiles returns packageJsonPaths = ['C:\\ws\\package.json', 'C:\\ws\\packages\\svc\\package.json', 'C:\\ws\\packages\\cli\\package.json']; ownerDirs = ['C:\\ws\\packages\\svc', 'C:\\ws\\packages\\cli', 'C:\\ws'].\n2. The runner records perFileUsedPackages = { 'C:/ws/packages/svc/index.ts' → {'@modelcontextprotocol/server'}, 'C:/ws/packages/cli/index.ts' → {'@modelcontextprotocol/client'} } (forward slashes from getFilePath()).\n3. ownerOf('C:/ws/packages/svc/index.ts') computes dir = 'C:/ws/packages/svc'; neither dir === 'C:\\ws\\packages\\svc' nor dir.startsWith('C:\\ws\\packages\\svc\\') is true for any owner, so it returns fallback = 'C:\\ws\\package.json' (the workspace root).\n4. Both files' v2 packages land in the root manifest's bucket. The root has no @modelcontextprotocol/sdk dependency, so applyPackageJsonUpdate returns undefined — the additions are dropped entirely.\n5. The member manifests (svc, cli) only get an entry from the "manifests that own no changed files" loop with an empty used-package set, so applyPackageJsonUpdate removes @modelcontextprotocol/sdk from each but adds no v2 packages (added: []).\n6. Net result on Windows: every member package.json loses its SDK dependency and gains nothing — the migrated workspace cannot install/build, and there is no error or warning telling the user why.\n\nHow to fix. Normalize the separators on both sides before comparing — e.g. convert both dir and each ownerDir with .replaceAll('\\\\', '/') (mirroring what fileWalk.ts already does for ignore matching) and compare with '/' instead of path.sep, or build ownerDirs from the standardized form in the first place. A one-line normalization in ownerOf() is sufficient.
| // v1's `StreamableHTTPError.code` carried the HTTP status (number); v2's `SdkHttpError.code` is the | ||
| // `SdkErrorCode` enum string and the status moved to `.status`. The class rename below keeps | ||
| // `error.code === 404` compiling (TS2367 if `error` is narrowed to `SdkHttpError`, otherwise no | ||
| // error) but always-false at runtime — the silent-misclassification class. Mark every `.code` | ||
| // access on an identifier that is `instanceof`-checked against this class so the user is steered to | ||
| // `.status` at the exact site, not just the file-level summary warning. | ||
| const instanceofSubjects = new Set<string>(); | ||
| for (const be of sourceFile.getDescendantsOfKind(SyntaxKind.BinaryExpression)) { | ||
| if (be.getOperatorToken().getKind() !== SyntaxKind.InstanceOfKeyword) continue; | ||
| const right = be.getRight(); | ||
| if (!Node.isIdentifier(right) || right.getText() !== localName) continue; | ||
| const left = be.getLeft(); | ||
| if (Node.isIdentifier(left)) instanceofSubjects.add(left.getText()); | ||
| } | ||
| for (const pa of sourceFile.getDescendantsOfKind(SyntaxKind.PropertyAccessExpression)) { | ||
| if (pa.getName() !== 'code') continue; | ||
| const obj = pa.getExpression(); | ||
| if (!Node.isIdentifier(obj) || !instanceofSubjects.has(obj.getText())) continue; | ||
| diagnostics.push( | ||
| actionRequired( | ||
| sourceFile.getFilePath(), | ||
| pa, | ||
| `\`${obj.getText()}.code\` on an SdkHttpError is the SdkErrorCode string in v2; the HTTP status is on ` + | ||
| `\`${obj.getText()}.status\`. Update this check (e.g. \`.code === 404\` → \`.status === 404\`).` | ||
| ) | ||
| ); | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 The .code marker scan in handleStreamableHTTPError matches identifiers by name only across the whole file, so an unrelated catch (error) block that checks error.code === ErrorCode.InvalidParams on a McpError/ProtocolError (still valid in v2) gets a misleading "use .status" @mcp-codemod-error comment just because the catch variable shares the name. Limiting the scan to .code accesses lexically inside (or guarded by) the instanceof check — or at least the same enclosing function/catch clause — would avoid the false positive.
Extended reasoning...
What the bug is. handleStreamableHTTPError builds instanceofSubjects from every x instanceof <localName> binary expression anywhere in the file, keyed solely by the left-hand identifier's text (removedApis.ts:148-155). The second loop (lines 156-168) then emits an actionRequired diagnostic — which carries insertComment, so an @mcp-codemod-error comment is physically inserted into the user's code — for every PropertyAccessExpression named code whose object identifier matches that text, anywhere in the file. There is no scoping to the instanceof-guarded expression, the enclosing block, or even the same function.
The code path that triggers it. Catch variables are conventionally named error or e, and a typical v1 client file has multiple catch (error) blocks. Consider:
// function A
try { await transport.send(msg); } catch (error) {
if (error instanceof StreamableHTTPError && error.code === 404) { /* retry */ }
}
// function B — completely unrelated
try { await client.callTool(req); } catch (error) {
if (error instanceof McpError && error.code === ErrorCode.InvalidParams) { /* report */ }
}Step-by-step: (1) the instanceof StreamableHTTPError in function A adds the identifier text "error" to instanceofSubjects; (2) the property-access loop scans the whole file for <id>.code where the id text is "error"; (3) the access in function B — error.code === ErrorCode.InvalidParams, where error is a McpError/ProtocolError — matches by text and receives an action-required marker telling the user that error.code "is the SdkErrorCode string in v2; the HTTP status is on .status". That advice is wrong at this site: ProtocolError.code checks against ErrorCode/ProtocolErrorCode are still correct in v2, and following the inserted instruction (switching to .status) would break working code. One verifier confirmed this empirically by running the transform on a two-catch-block fixture and observing two markers, including one on the McpError block.
Why existing code doesn't prevent it. The only negative test added in this PR ("does not flag .code on an unrelated identifier") uses a differently named identifier (other.code vs e instanceof …), so the same-name case — the common one in real code — is untested. The contextTypes catch-all marker added in this same PR scopes its heuristic to the body of the function owning the typed parameter, so a tighter scope is clearly achievable here too.
Impact. This is a false-positive advisory marker, not a wrong rewrite — the codemod does not change the access itself. But the inserted @mcp-codemod-error comment is action-required by convention and gives actively wrong migration guidance at a correct call site, in a pattern (multiple catch (error) blocks, one HTTP-error check plus one protocol-error-code check) that shows up in essentially every real client codebase.
How to fix. Limit the second scan to .code accesses that are lexically contained in the instanceof-guarded region: e.g. only flag accesses within the same binary expression / && chain as the instanceof check, or within the if/block whose condition contains it — or, as a minimal improvement, restrict matching to the same enclosing function or catch clause as the instanceof occurrence rather than the whole source file.
Fixes nine codemod bugs that surface on real-world v1→v2 migrations.
zod: "^3.x"left as-is with no warningactionRequireddiagnostic emitted (range left untouched; user owns the bump)package.jsonupdated in a pnpm workspaceimport('…').Ttype-position qualifiers left pointing at v1 pathsas/parens, or assigned tofallbackRequestHandler, skipped by theextra→ctxremape.code === 401after theStreamableHTTPError→SdkHttpErrorrename compiles but is always false (codeis now an enum string)@mcp-codemod-errormarker emitted at the accessz.object()wrapper emitted without addingimport { z } from 'zod'require.resolve('@modelcontextprotocol/sdk/...')left pointing at removed pathsrequirecondition exists in v2)/** */or//block deleted/displaced when imports are rewritten below itELOOPcrash on pnpmnode_modulessymlink cycles;--ignorenot honored during descent--ignoreapplied during traversalHow Has This Been Tested?
One fixture test per row. 407/407.
Breaking Changes
None.
Types of changes
Checklist